13 research outputs found

    Compendium-Wide Analysis of Pseudomonas aeruginosa Core and Accessory Genes Reveals Transcriptional Patterns across Strains PAO1 and PA14.

    Pseudomonas aeruginosa is an opportunistic pathogen that causes difficult-to-treat infections. Two well-studied divergent P. aeruginosa strain types, PAO1 and PA14, have significant genomic heterogeneity, including diverse accessory genes present in only some strains. Genome content comparisons find core genes that are conserved across both PAO1 and PA14 strains and accessory genes that are present in only a subset of PAO1 and PA14 strains. Here, we use recently assembled transcriptome compendia of publicly available P. aeruginosa RNA sequencing (RNA-seq) samples to create two smaller compendia consisting of only strain PAO1 or strain PA14 samples with each aligned to their cognate reference genome. We confirmed strain annotations and identified other samples for inclusion by assessing each sample's median expression of PAO1-only or PA14-only accessory genes. We then compared the patterns of core gene expression in each strain. To do so, we developed a method by which we analyzed genes in terms of which genes showed similar expression patterns across strain types. We found that some core genes had consistent correlated expression patterns across both compendia, while others were less stable in an interstrain comparison. For each accessory gene, we also determined core genes with correlated expression patterns. We found that stable core genes had fewer coexpressed neighbors that were accessory genes. Overall, this approach for analyzing expression patterns across strain types can be extended to other groups of genes, like phage genes, or applied for analyzing patterns beyond groups of strains, such as samples with different traits, to reveal a deeper understanding of regulation.

    Unsupervised Extraction of Stable Expression Signatures from Public Compendia with an Ensemble of Neural Networks

    Cross-experiment comparisons in public data compendia are challenged by unmatched conditions and technical noise. The ADAGE method, which performs unsupervised integration with denoising autoencoder neural networks, can identify biological patterns, but because ADAGE models, like many neural networks, are over-parameterized, different ADAGE models perform equally well. To enhance model robustness and better build signatures consistent with biological pathways, we developed an ensemble ADAGE (eADAGE) that integrated stable signatures across models. We applied eADAGE to a compendium of Pseudomonas aeruginosa gene expression profiling experiments performed in 78 media. eADAGE revealed a phosphate starvation response controlled by PhoB in media with moderate phosphate and predicted that a second stimulus provided by the sensor kinase, KinB, is required for this PhoB activation. We validated this relationship using both targeted and unbiased genetic approaches. eADAGE, which captures stable biological patterns, enables cross-experiment comparisons that can highlight measured but undiscovered relationships. Funding: Gordon and Betty Moore Foundation (GBMF 4552); National Institutes of Health (U.S.) (grant R01-AI091702); Cystic Fibrosis Foundation (STANTO15R0

    Machine Learning on Images of a Microbial Mutant Library

    Environmental isolates, like BJB312, are interesting because of their potential therapeutic properties. Transposon mutagenesis is a technique used to determine the function of genes by randomly disrupting a genome and observing the phenotypic effects. The genome of BJB312 consists of over 5,000 genes, requiring 57,000 independent insertion mutants in order to break every gene in the genome. It is unwieldy to screen such a large library for defects. I used image processing techniques to convert qualitative data on mutant bacterial colony morphology into a quantitative data set that is amenable to data mining. Further, I built a tool of ensemble machine learning techniques that automatically analyzes a large library of mutants. It first uses the unsupervised methods k-means and Ward's hierarchical clustering to find a patterned, recurrent phenotype. It then uses a Support Vector Machine to screen the library at large. This tool is robust and useful on real-world data because it utilizes machine learning techniques to filter the image library before reaching the final clustering solution. Ten transposon insertion mutants that clustered together were characterized by lessened biofilm. This proof-of-concept study shows that genomic and high-throughput functional characterizations can be combined in order to rapidly explore a novel microbe.

    Conditional antagonism in co-cultures of Pseudomonas aeruginosa and Candida albicans: An intersection of ethanol and phosphate signaling distilled from dual-seq transcriptomics.

    Pseudomonas aeruginosa and Candida albicans are opportunistic pathogens whose interactions involve the secreted products ethanol and phenazines. Here, we describe the role of ethanol in mixed-species co-cultures by dual-seq analyses. P. aeruginosa and C. albicans transcriptomes were assessed after growth in mono-culture or co-culture with either ethanol-producing C. albicans or a C. albicans mutant lacking the primary ethanol dehydrogenase, Adh1. Analysis of the RNA-Seq data using KEGG pathway enrichment and eADAGE methods revealed several P. aeruginosa responses to C. albicans-produced ethanol including the induction of a non-canonical low-phosphate response regulated by PhoB. C. albicans wild type, but not C. albicans adh1Δ/Δ, induces P. aeruginosa production of 5-methyl-phenazine-1-carboxylic acid (5-MPCA), which forms a red derivative within fungal cells and exhibits antifungal activity. Here, we show that C. albicans adh1Δ/Δ no longer activates P. aeruginosa PhoB and PhoB-regulated phosphatase activity, that exogenous ethanol complements this defect, and that ethanol is sufficient to activate PhoB in single-species P. aeruginosa cultures at permissive phosphate levels. The intersection of ethanol and phosphate in co-culture is inversely reflected in C. albicans; C. albicans adh1Δ/Δ had increased expression of genes regulated by Pho4, the C. albicans transcription factor that responds to low phosphate, and Pho4-dependent phosphatase activity. Together, these results show that C. albicans-produced ethanol stimulates P. aeruginosa PhoB activity and 5-MPCA-mediated antagonism, and that both responses are dependent on local phosphate concentrations. Further, our data suggest that phosphate scavenging by one species improves phosphate access for the other, thus highlighting the complex dynamics at play in microbial communities.

    PathCORE-T: identifying and visualizing globally co-occurring pathways in large transcriptomic compendia

    Background: Investigators often interpret genome-wide data by analyzing the expression levels of genes within pathways. While this within-pathway analysis is routine, the products of any one pathway can affect the activity of other pathways. Past efforts to identify relationships between biological processes have evaluated overlap in knowledge bases or evaluated changes that occur after specific treatments. Individual experiments can highlight condition-specific pathway-pathway relationships; however, constructing a complete network of such relationships across many conditions requires analyzing results from many studies. Results: We developed the PathCORE-T framework by implementing existing methods to identify pathway-pathway transcriptional relationships evident across a broad data compendium. PathCORE-T is applied to the output of feature construction algorithms; it identifies pairs of pathways observed in features more than expected by chance as functionally co-occurring. We demonstrate PathCORE-T by analyzing an existing eADAGE model of a microbial compendium and building and analyzing NMF features from the TCGA dataset of 33 cancer types. The PathCORE-T framework includes a demonstration web interface, with source code, that users can launch to (1) visualize the network and (2) review the expression levels of associated genes in the original data. PathCORE-T creates and displays the network of globally co-occurring pathways based on features observed in a machine learning analysis of gene expression data. Conclusions: The PathCORE-T framework identifies transcriptionally co-occurring pathways from the results of unsupervised analysis of gene expression data and visualizes the relationships between pathways as a network. PathCORE-T recapitulated previously described pathway-pathway relationships and suggested experimentally testable additional hypotheses that remain to be explored.

    Additional file 2: of PathCORE-T: identifying and visualizing globally co-occurring pathways in large transcriptomic compendia

    PathCORE-T network constructed from a Pseudomonas aeruginosa compendium using 10 eADAGE models analyzed with KEGG pathway annotations. (TSV 16 kb)

    eADAGE-1.0.0rc1

    This is the source code required to reproduce data analysis figures from the initial preprint of "System-wide automatic extraction of functional signatures in Pseudomonas aeruginosa with eADAGE." This includes the eADAGE method as well as code for the comparison methods.

    eADAGE-1.0.0rc2

    This is the source code required to reproduce data analysis figures from the manuscript, "System-wide automatic extraction of functional signatures in Pseudomonas aeruginosa with eADAGE." This includes the eADAGE method as well as code for the comparison methods.